Overview

Dataset statistics

Number of variables: 35
Number of observations: 121856
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3535
Duplicate rows (%): 2.9%
Total size in memory: 32.5 MiB
Average record size in memory: 280.0 B
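
The two derived figures in this table can be checked from the raw counts. The sketch below is only an arithmetic cross-check using the numbers printed above; the variable names are illustrative, and because the 32.5 MiB total is itself rounded, the per-record size comes out close to, not exactly at, the reported 280.0 B.

```python
# Figures copied from the "Dataset statistics" table.
n_observations = 121856
n_duplicates = 3535
total_size_mib = 32.5  # already rounded in the report

# Share of duplicate rows, in percent.
duplicate_pct = 100 * n_duplicates / n_observations

# Average bytes per record (1 MiB = 1024 * 1024 bytes).
avg_record_bytes = total_size_mib * 1024 ** 2 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```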

Variable types

NUM: 15
BOOL: 10
CAT: 10

Reproduction

Analysis started: 2021-11-27 07:07:24.688711
Analysis finished: 2021-11-27 07:08:44.787137
Duration: 1 minute and 20.1 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
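
The reported duration follows directly from the two timestamps. A minimal check with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-11-27 07:07:24.688711", FMT)
finished = datetime.strptime("2021-11-27 07:08:44.787137", FMT)

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed_seconds = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed_seconds, 60)

print(f"{int(minutes)} minute(s) and {seconds:.1f} seconds")
```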

Warnings

Dataset has 3535 (2.9%) duplicate rows [Duplicates]
Type_Organization has a high cardinality: 58 distinct values [High cardinality]
Client_Income is highly skewed (γ1 = 37.70687551) [Skewed]
Population_Region_Relative is highly skewed (γ1 = 246.4131264) [Skewed]
Child_Count has 86472 (71.0%) zeros [Zeros]
Application_Process_Day has 6287 (5.2%) zeros [Zeros]
Phone_Change has 14555 (11.9%) zeros [Zeros]
Credit_Bureau has 28003 (23.0%) zeros [Zeros]
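
The skewness and zero-share warnings above rest on two simple per-column measures. The sketch below shows one way to compute them in plain Python. It uses the population moment definition of skewness (third central moment divided by the 1.5 power of the second); the report's own estimator may apply a small-sample correction, so values need not match to every digit. The function names and toy data are illustrative and not part of the report.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Toy data: a symmetric sample and one with a single large value.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 10]

print(skewness(symmetric))       # no asymmetry
print(skewness(right_tailed))    # positive: long right tail
print(zero_share([0, 0, 1, 2]))  # half the entries are zero
```

A single extreme value is enough to push the skewness up sharply, which is consistent with Client_Income and Population_Region_Relative, whose maxima sit far above their 95th percentiles.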

Variables

Client_Income
Real number (ℝ≥0)

SKEWED

Distinct count: 1216
Unique (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16791.917343749996
Minimum: 2565.0
Maximum: 1800009.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 2565
5-th percentile: 6750
Q1: 11250
Median: 14400
Q3: 20250
95-th percentile: 33300
Maximum: 1800009
Range: 1797444
Interquartile range (IQR): 9000

Descriptive statistics

Standard deviation: 11373.08972
Coefficient of variation (CV): 0.6772954804
Kurtosis: 5148.16906
Mean: 16791.91734
Median Absolute Deviation (MAD): 4050
Skewness: 37.70687551
Sum: 2046195880
Variance: 129347169.9
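
Several of the Client_Income figures are derived from the others, which gives a quick consistency check. The sketch below recomputes the range, the IQR, the coefficient of variation, the variance and the mean from the printed values. It assumes the usual definitions (range = maximum minus minimum, IQR = Q3 minus Q1, CV = standard deviation divided by mean, variance = standard deviation squared, mean = sum divided by the number of observations); small differences come from rounding in the printed figures.

```python
# Values copied from the Client_Income tables and the dataset overview.
n_observations = 121856
minimum, maximum = 2565, 1800009
q1, q3 = 11250, 20250
std_dev = 11373.08972
mean = 16791.91734
total = 2046195880

value_range = maximum - minimum
iqr = q3 - q1
cv = std_dev / mean
variance = std_dev ** 2
mean_from_sum = total / n_observations

print(value_range, iqr)
print(round(cv, 6), round(variance, 1), round(mean_from_sum, 3))
```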
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
135001371711.3%
 
11250119409.8%
 
15750101468.3%
 
1800095147.8%
 
900087057.1%
 
2250079756.5%
 
2025063095.2%
 
1440048334.0%
 
675041903.4%
 
2700041473.4%
 
Other values (1206)4038033.1%
 
ValueCountFrequency (%) 
25651< 0.1%
 
26101< 0.1%
 
26461< 0.1%
 
270025< 0.1%
 
27905< 0.1%
 
ValueCountFrequency (%) 
18000091< 0.1%
 
6750001< 0.1%
 
4500003< 0.1%
 
395005.951< 0.1%
 
3825001< 0.1%
 

Car_Owned
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
0 | 81305 | 66.7%
1 | 40551 | 33.3%

Bike_Owned
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
0 | 82572 | 67.8%
1 | 39284 | 32.2%

Active_Loan
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
0 | 62843 | 51.6%
1 | 59013 | 48.4%

House_Own
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
1 | 85459 | 70.1%
0 | 36397 | 29.9%

Child_Count
Real number (ℝ≥0)

ZEROS

Distinct count: 14
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4053062631302521
Minimum: 0.0
Maximum: 19.0
Zeros: 86472
Zeros (%): 71.0%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 2
Maximum: 19
Range: 19
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7213533491
Coefficient of variation (CV): 1.779773506
Kurtosis: 12.4822452
Mean: 0.4053062631
Median Absolute Deviation (MAD): 0
Skewness: 2.238812197
Sum: 49389
Variance: 0.5203506542
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
08647271.0%
 
12343119.2%
 
2102948.4%
 
314301.2%
 
41670.1%
 
534< 0.1%
 
612< 0.1%
 
74< 0.1%
 
144< 0.1%
 
103< 0.1%
 
Other values (4)5< 0.1%
 
ValueCountFrequency (%) 
08647271.0%
 
12343119.2%
 
2102948.4%
 
314301.2%
 
41670.1%
 
ValueCountFrequency (%) 
191< 0.1%
 
144< 0.1%
 
121< 0.1%
 
103< 0.1%
 
91< 0.1%
 

Credit_Amount
Real number (ℝ≥0)

Distinct count: 4175
Unique (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 59798.86609850971
Minimum: 4500.0
Maximum: 405000.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 4500
5-th percentile: 14220
Q1: 27450
Median: 51750
Q3: 80865
95-th percentile: 135000
Maximum: 405000
Range: 400500
Interquartile range (IQR): 53415

Descriptive statistics

Standard deviation: 39768.99602
Coefficient of variation (CV): 0.6650459886
Kurtosis: 2.054969419
Mean: 59798.8661
Median Absolute Deviation (MAD): 24750
Skewness: 1.265143742
Sum: 7286850627
Variance: 1581573045
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
5175037443.1%
 
4500037333.1%
 
6750034502.8%
 
2250031272.6%
 
1800027612.3%
 
2700027482.3%
 
9000023661.9%
 
2547017591.4%
 
5450417261.4%
 
8086515841.3%
 
Other values (4165)9485877.8%
 
ValueCountFrequency (%) 
4500760.1%
 
4797770.1%
 
4945.510< 0.1%
 
495013< 0.1%
 
4975.223< 0.1%
 
ValueCountFrequency (%) 
4050003< 0.1%
 
403103.251< 0.1%
 
386001.91< 0.1%
 
329968.81< 0.1%
 
3150006< 0.1%
 

Loan_Annuity
Real number (ℝ≥0)

Distinct count: 10856
Unique (%): 8.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2712.4820012966125
Minimum: 217.35
Maximum: 22500.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 217.35
5-th percentile: 900
Q1: 1687.5
Median: 2499.75
Q3: 3407.9625
95-th percentile: 5324.85
Maximum: 22500
Range: 22282.65
Interquartile range (IQR): 1720.4625

Descriptive statistics

Standard deviation: 1432.884876
Coefficient of variation (CV): 0.5282559941
Kurtosis: 9.546879866
Mean: 2712.482001
Median Absolute Deviation (MAD): 841.05
Skewness: 1.730310386
Sum: 330532206.8
Variance: 2053159.068
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
2499.7548574.0%
 
90023701.9%
 
135020911.7%
 
6758450.7%
 
1012.57740.6%
 
37806210.5%
 
2621.75700.5%
 
11255660.5%
 
1237.54870.4%
 
3165.34750.4%
 
Other values (10846)10820088.8%
 
ValueCountFrequency (%) 
217.352< 0.1%
 
218.72< 0.1%
 
229.51< 0.1%
 
241.22< 0.1%
 
258.31< 0.1%
 
ValueCountFrequency (%) 
2250014< 0.1%
 
21329.11< 0.1%
 
20821.51< 0.1%
 
20646.452< 0.1%
 
20616.751< 0.1%
 

Accompany_Client
Categorical

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
Alone | 97409 | 79.9%
Relative | 15748 | 12.9%
Partner | 4516 | 3.7%
Other | 1758 | 1.4%
Kids | 1334 | 1.1%
Others | 987 | 0.8%
Group | 104 | 0.1%

Length

Max length: 8
Median length: 5
Mean length: 5.458976169
Min length: 4

Client_Income_Type
Categorical

Distinct count: 9
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
Service | 61028 | 50.1%
Commercial | 27764 | 22.8%
Retired | 21043 | 17.3%
Govt Job | 8303 | 6.8%
Other | 3701 | 3.0%
Student | 8 | < 0.1%
Unemployed | 6 | < 0.1%
Maternity leave | 2 | < 0.1%
Businessman | 1 | < 0.1%

Length

Max length: 15
Median length: 7
Mean length: 7.691233915
Min length: 5

Client_Education
Categorical

Distinct count: 6
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
Secondary | 83911 | 68.9%
Graduation | 28819 | 23.7%
Graduation dropout | 3960 | 3.2%
Other | 3645 | 3.0%
Junior secondary | 1455 | 1.2%
Post Grad | 66 | 0.1%

Length

Max length: 18
Median length: 9
Mean length: 9.492909664
Min length: 5

Client_Marital_Status
Categorical

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
M | 87349 | 71.7%
S | 17404 | 14.3%
D | 7556 | 6.2%
W | 6074 | 5.0%
Other | 3473 | 2.9%

Length

Max length: 5
Median length: 1
Mean length: 1.114003414
Min length: 1

Client_Gender
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
Male | 78463 | 64.4%
Female | 40977 | 33.6%
Other | 2413 | 2.0%
XNA | 3 | < 0.1%

Length

Max length: 6
Median length: 4
Mean length: 4.692325368
Min length: 3

Loan_Contract_Type
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
CL | 107118 | 87.9%
RL | 11087 | 9.1%
Other | 3651 | 3.0%

Length

Max length: 5
Median length: 2
Mean length: 2.089884782
Min length: 2

Client_Housing_Type
Categorical

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
Home | 104870 | 86.1%
Family | 5783 | 4.7%
Municipal | 4248 | 3.5%
Other | 3687 | 3.0%
Rental | 1816 | 1.5%
Office | 1002 | 0.8%
Shared | 450 | 0.4%

Length

Max length: 9
Median length: 4
Mean length: 4.353113511
Min length: 4

Population_Region_Relative
Real number (ℝ≥0)

SKEWED

Distinct count: 100
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.02245252605534401
Minimum: 0.000533
Maximum: 100.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 0.000533
5-th percentile: 0.005002
Q1: 0.010032
Median: 0.01885
Q3: 0.026392
95-th percentile: 0.04622
Maximum: 100
Range: 99.999467
Interquartile range (IQR): 0.01636

Descriptive statistics

Standard deviation: 0.4052712685
Coefficient of variation (CV): 18.05014133
Kurtosis: 60787.35483
Mean: 0.02245252606
Median Absolute Deviation (MAD): 0.008818
Skewness: 246.4131264
Sum: 2735.975015
Variance: 0.164244801
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
0.0188578136.4%
 
0.0462251174.2%
 
0.03075546393.8%
 
0.02516445163.7%
 
0.03132942573.5%
 
0.02866342273.5%
 
0.03579241593.4%
 
0.01910133142.7%
 
0.07250833032.7%
 
0.02071330452.5%
 
Other values (90)7746663.6%
 
ValueCountFrequency (%) 
0.00053317< 0.1%
 
0.00093812< 0.1%
 
0.0012761940.2%
 
0.001333940.1%
 
0.0014171890.2%
 
ValueCountFrequency (%) 
1002< 0.1%
 
0.07250833032.7%
 
0.0462251174.2%
 
0.03579221081.7%
 
0.03579241593.4%
 

Age_Days
Real number (ℝ≥0)

Distinct count: 17000
Unique (%): 14.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16018.71339942227
Minimum: 7676.0
Maximum: 25201.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 7676
5-th percentile: 9450
Q1: 12512
Median: 15734
Q3: 19544
95-th percentile: 23181
Maximum: 25201
Range: 17525
Interquartile range (IQR): 7032

Descriptive statistics

Standard deviation: 4301.353723
Coefficient of variation (CV): 0.2685205494
Kurtosis: -0.9859887638
Mean: 16018.7134
Median Absolute Deviation (MAD): 3505
Skewness: 0.1297887205
Sum: 1951976340
Variance: 18501643.85
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
1573436303.0%
 
1093622< 0.1%
 
1332721< 0.1%
 
1233421< 0.1%
 
2019321< 0.1%
 
1323121< 0.1%
 
1107320< 0.1%
 
1662220< 0.1%
 
1006520< 0.1%
 
2097220< 0.1%
 
Other values (16990)11804096.9%
 
ValueCountFrequency (%) 
76762< 0.1%
 
76781< 0.1%
 
76791< 0.1%
 
76803< 0.1%
 
76831< 0.1%
 
ValueCountFrequency (%) 
252011< 0.1%
 
252001< 0.1%
 
251972< 0.1%
 
251963< 0.1%
 
251951< 0.1%
 

Employed_Days
Real number (ℝ≥0)

Distinct count: 9949
Unique (%): 8.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65200.30854451156
Minimum: 0.0
Maximum: 365243.0
Zeros: 2
Zeros (%): < 0.1%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 232
Q1: 962
Median: 2212
Q3: 5385
95-th percentile: 365243
Maximum: 365243
Range: 365243
Interquartile range (IQR): 4423

Descriptive statistics

Standard deviation: 137314.1891
Coefficient of variation (CV): 2.106035879
Kurtosis: 0.983645745
Mean: 65200.30854
Median Absolute Deviation (MAD): 1539
Skewness: 1.726867528
Sum: 7945048798
Variance: 1.885518654e+10
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
3652432109817.3%
 
221236803.0%
 
381690.1%
 
212660.1%
 
230640.1%
 
231610.1%
 
11660< 0.1%
 
19960< 0.1%
 
21659< 0.1%
 
76558< 0.1%
 
Other values (9939)9658179.3%
 
ValueCountFrequency (%) 
02< 0.1%
 
22< 0.1%
 
41< 0.1%
 
61< 0.1%
 
82< 0.1%
 
ValueCountFrequency (%) 
3652432109817.3%
 
175462< 0.1%
 
171701< 0.1%
 
171391< 0.1%
 
166782< 0.1%
 

Registration_Days
Real number (ℝ≥0)

Distinct count: 14142
Unique (%): 11.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4960.794913668592
Minimum: 0.0
Maximum: 23738.0
Zeros: 35
Zeros (%): < 0.1%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 343
Q1: 2102
Median: 4493
Q3: 7350
95-th percentile: 11316
Maximum: 23738
Range: 23738
Interquartile range (IQR): 5248

Descriptive statistics

Standard deviation: 3462.758841
Coefficient of variation (CV): 0.6980249943
Kurtosis: -0.2295366379
Mean: 4960.794914
Median Absolute Deviation (MAD): 2608
Skewness: 0.6119843055
Sum: 604502625
Variance: 11990698.79
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
449336513.0%
 
145< 0.1%
 
638< 0.1%
 
438< 0.1%
 
938< 0.1%
 
236< 0.1%
 
035< 0.1%
 
78435< 0.1%
 
51135< 0.1%
 
97333< 0.1%
 
Other values (14132)11787296.7%
 
ValueCountFrequency (%) 
035< 0.1%
 
145< 0.1%
 
236< 0.1%
 
327< 0.1%
 
438< 0.1%
 
ValueCountFrequency (%) 
237382< 0.1%
 
227011< 0.1%
 
218651< 0.1%
 
212491< 0.1%
 
208401< 0.1%
 

ID_Days
Real number (ℝ≥0)

Distinct count: 5962
Unique (%): 4.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2999.9722951680674
Minimum: 0.0
Maximum: 7197.0
Zeros: 6
Zeros (%): < 0.1%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 385
Q1: 1789
Median: 3242
Q3: 4263
95-th percentile: 4928
Maximum: 7197
Range: 7197
Interquartile range (IQR): 2474

Descriptive statistics

Standard deviation: 1475.314242
Coefficient of variation (CV): 0.4917759555
Kurtosis: -1.00893715
Mean: 2999.972295
Median Absolute Deviation (MAD): 1143
Skewness: -0.3774094507
Sum: 365564624
Variance: 2176552.112
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
324260154.9%
 
4053760.1%
 
4032740.1%
 
4375730.1%
 
4312700.1%
 
4144690.1%
 
4250670.1%
 
4619660.1%
 
4404640.1%
 
4193640.1%
 
Other values (5952)11521894.6%
 
ValueCountFrequency (%) 
06< 0.1%
 
122< 0.1%
 
212< 0.1%
 
323< 0.1%
 
418< 0.1%
 
ValueCountFrequency (%) 
71971< 0.1%
 
62742< 0.1%
 
62632< 0.1%
 
62351< 0.1%
 
62331< 0.1%
 

Own_House_Age
Real number (ℝ≥0)

Distinct count: 55
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.082039456407562
Minimum: 0.0
Maximum: 69.0
Zeros: 859
Zeros (%): 0.7%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 9
Median: 9
Q3: 9
95-th percentile: 19
Maximum: 69
Range: 69
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 7.21505548
Coefficient of variation (CV): 0.7156345213
Kurtosis: 34.50981802
Mean: 10.08203946
Median Absolute Deviation (MAD): 0
Skewness: 5.186886074
Sum: 1228557
Variance: 52.05702558
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
98214867.4%
 
730152.5%
 
325552.1%
 
625252.1%
 
223211.9%
 
823021.9%
 
421751.8%
 
120411.7%
 
1019451.6%
 
1418551.5%
 
Other values (45)1897415.6%
 
ValueCountFrequency (%) 
08590.7%
 
120411.7%
 
223211.9%
 
325552.1%
 
421751.8%
 
ValueCountFrequency (%) 
691< 0.1%
 
653920.3%
 
649740.8%
 
632< 0.1%
 
572< 0.1%
 

Mobile_Tag
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
1 | 121855 | > 99.9%
0 | 1 | < 0.1%

Homephone_Tag
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
0 | 97424 | 80.0%
1 | 24432 | 20.0%

Workphone_Working
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
0 | 87590 | 71.9%
1 | 34266 | 28.1%

Client_Occupation
Categorical

Distinct count: 19
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
Other | 41435 | 34.0%
Laborers | 21024 | 17.3%
Sales | 12136 | 10.0%
Core | 10611 | 8.7%
Managers | 8099 | 6.6%
Drivers | 7150 | 5.9%
High skill tech | 4317 | 3.5%
Accountants | 3766 | 3.1%
Medicine | 3172 | 2.6%
Security | 2683 | 2.2%
Other values (9) | 7463 | 6.1%

Length

Max length: 18
Median length: 5
Mean length: 6.748892135
Min length: 2

Client_Family_Members
Real number (ℝ≥0)

Distinct count: 15
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1512769170168067
Minimum: 1.0
Maximum: 16.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 2
Q3: 3
95-th percentile: 4
Maximum: 16
Range: 15
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9038706554
Coefficient of variation (CV): 0.4201554194
Kurtosis: 3.212538514
Mean: 2.151276917
Median Absolute Deviation (MAD): 0
Skewness: 1.053342754
Sum: 262146
Variance: 0.8169821617
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
26406252.6%
 
12621321.5%
 
32043416.8%
 
495837.9%
 
513491.1%
 
61570.1%
 
732< 0.1%
 
811< 0.1%
 
94< 0.1%
 
123< 0.1%
 
Other values (5)8< 0.1%
 
ValueCountFrequency (%) 
12621321.5%
 
26406252.6%
 
32043416.8%
 
495837.9%
 
513491.1%
 
ValueCountFrequency (%) 
162< 0.1%
 
151< 0.1%
 
141< 0.1%
 
131< 0.1%
 
123< 0.1%
 

Cleint_City_Rating
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
2 | 91358 | 75.0%
3 | 17043 | 14.0%
1 | 13455 | 11.0%

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Application_Process_Day
Real number (ℝ≥0)

ZEROS

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.1565536370798317
Minimum: 0.0
Maximum: 6.0
Zeros: 6287
Zeros (%): 5.2%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 3
Q3: 5
95-th percentile: 6
Maximum: 6
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.741575091
Coefficient of variation (CV): 0.5517330898
Kurtosis: -1.053197706
Mean: 3.156553637
Median Absolute Deviation (MAD): 1
Skewness: 0.01327148243
Sum: 384645
Variance: 3.033083798
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
32254418.5%
 
22090717.2%
 
11971216.2%
 
41966816.1%
 
51961316.1%
 
61312510.8%
 
062875.2%
 
ValueCountFrequency (%) 
062875.2%
 
11971216.2%
 
22090717.2%
 
32254418.5%
 
41966816.1%
 
ValueCountFrequency (%) 
61312510.8%
 
51961316.1%
 
41966816.1%
 
32254418.5%
 
22090717.2%
 

Application_Process_Hour
Real number (ℝ≥0)

Distinct count: 24
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.061203387605042
Minimum: 0.0
Maximum: 23.0
Zeros: 26
Zeros (%): < 0.1%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7
Q1: 10
Median: 12
Q3: 14
95-th percentile: 17
Maximum: 23
Range: 23
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.231026936
Coefficient of variation (CV): 0.2678859507
Kurtosis: -0.09479097959
Mean: 12.06120339
Median Absolute Deviation (MAD): 2
Skewness: -0.03300076315
Sum: 1469730
Variance: 10.43953506
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
121664013.7%
 
101446511.9%
 
111441311.8%
 
13117659.7%
 
14107028.8%
 
9105258.6%
 
1596147.9%
 
1677396.4%
 
1758434.8%
 
858214.8%
 
Other values (14)1432911.8%
 
ValueCountFrequency (%) 
026< 0.1%
 
128< 0.1%
 
21120.1%
 
35060.4%
 
48540.7%
 
ValueCountFrequency (%) 
2314< 0.1%
 
22670.1%
 
211640.1%
 
204940.4%
 
1914641.2%
 

Client_Permanent_Match_Tag
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
Yes | 112454 | 92.3%
No | 9402 | 7.7%

Client_Contact_Work_Tag
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
Yes | 100015 | 82.1%
No | 21841 | 17.9%

Type_Organization
Categorical

HIGH CARDINALITY

Distinct count: 58
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
Business Entity Type 3 | 26279 | 21.6%
XNA | 21085 | 17.3%
Self-employed | 14725 | 12.1%
Other | 9899 | 8.1%
Medicine | 4320 | 3.5%
Business Entity Type 2 | 4126 | 3.4%
Government | 3971 | 3.3%
School | 3371 | 2.8%
Trade: type 7 | 2979 | 2.4%
Kindergarten | 2686 | 2.2%
Other values (48) | 28415 | 23.3%

Length

Max length: 22
Median length: 13
Mean length: 12.33328683
Min length: 3

Phone_Change
Real number (ℝ≥0)

ZEROS

Distinct count: 3590
Unique (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 955.8787339154412
Minimum: 0.0
Maximum: 4185.0
Zeros: 14555
Zeros (%): 11.9%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 287
Median: 755
Q3: 1550
95-th percentile: 2510
Maximum: 4185
Range: 4185
Interquartile range (IQR): 1263

Descriptive statistics

Standard deviation: 816.2003833
Coefficient of variation (CV): 0.853874403
Kurtosis: -0.2085030367
Mean: 955.8787339
Median Absolute Deviation (MAD): 602
Skewness: 0.7479120311
Sum: 116479559
Variance: 666183.0657
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
01455511.9%
 
75536973.0%
 
111040.9%
 
29160.8%
 
36450.5%
 
45240.4%
 
53220.3%
 
62160.2%
 
71790.1%
 
81180.1%
 
Other values (3580)9958081.7%
 
ValueCountFrequency (%) 
01455511.9%
 
111040.9%
 
29160.8%
 
36450.5%
 
45240.4%
 
ValueCountFrequency (%) 
41851< 0.1%
 
41531< 0.1%
 
41282< 0.1%
 
41211< 0.1%
 
40922< 0.1%
 

Credit_Bureau
Real number (ℝ≥0)

ZEROS

Distinct count: 21
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.75550649947479
Minimum: 0.0
Maximum: 22.0
Zeros: 28003
Zeros (%): 23.0%
Memory size: 952.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 3
95-th percentile: 5
Maximum: 22
Range: 22
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.744052786
Coefficient of variation (CV): 0.9934755503
Kurtosis: 3.122806111
Mean: 1.755506499
Median Absolute Deviation (MAD): 1
Skewness: 1.512538128
Sum: 213919
Variance: 3.041720119
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
14311235.4%
 
02800323.0%
 
21960616.1%
 
31310210.8%
 
479786.5%
 
546713.8%
 
626602.2%
 
714211.2%
 
88320.7%
 
94220.3%
 
Other values (11)49< 0.1%
 
ValueCountFrequency (%) 
02800323.0%
 
14311235.4%
 
21960616.1%
 
31310210.8%
 
479786.5%
 
ValueCountFrequency (%) 
221< 0.1%
 
212< 0.1%
 
195< 0.1%
 
172< 0.1%
 
161< 0.1%
 

Default
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 952.0 KiB

Value | Count | Frequency (%)
0 | 112011 | 91.9%
1 | 9845 | 8.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exact linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
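
The definition above can be sketched in a few lines of plain Python. This is an illustration of the formula only, not the code the report itself runs; the function name and sample data are hypothetical. The three toy series also show the invariance property: lines with very different slopes all give |r| = 1, and only the sign of the slope changes the sign of r.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n factors of the covariance and the standard deviations cancel.
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
shallow = [0.1 * x + 2.0 for x in xs]   # gentle upward line
steep = [50.0 * x - 7.0 for x in xs]    # steep upward line
falling = [-3.0 * x for x in xs]        # downward line

print(pearson_r(xs, shallow), pearson_r(xs, steep), pearson_r(xs, falling))
```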

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
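
As a sketch of that recipe: replace each value by its rank, then apply the Pearson formula to the ranks. The helper names below are illustrative, and ties are given their average rank, which is a common convention but an assumption here. The toy data is a cubic relationship: perfectly monotonic, but not linear.

```python
import math


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]

print(spearman_rho(xs, cubes))  # monotonic, so the ranks agree exactly
print(pearson_r(xs, cubes))     # below 1, because the relation is not linear
```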

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
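
A direct, quadratic-time sketch of that pair-counting definition follows. It implements the plain form described in the text (often called tau-a), where tied pairs count as neither concordant nor discordant; library implementations frequently use a tie-adjusted variant instead, so results can differ when the data contains ties. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    return (concordant - discordant) / len(pairs)


xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]  # one swapped pair out of six

print(kendall_tau_a(xs, ys))
```

With four observations there are six pairs; five are concordant and one is discordant, giving (5 - 1) / 6.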

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
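
A sketch of the bias-corrected estimator, written from the commonly cited form of Bergsma's correction: with φ² = χ²/n, r rows and k columns, the corrected quantities are φ̃² = max(0, φ² − (k−1)(r−1)/(n−1)), k̃ = k − (k−1)²/(n−1), r̃ = r − (r−1)²/(n−1), and V = sqrt(φ̃² / min(k̃−1, r̃−1)). The function name is illustrative, and the code assumes a table of at least 2×2 in which every row and column has a non-zero total; it is not taken from the report's implementation.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


independent = [[10, 10], [10, 10]]  # cell counts match the independence expectation
perfect = [[50, 0], [0, 50]]        # each row category maps to one column category

print(cramers_v_corrected(independent))
print(cramers_v_corrected(perfect))
```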

Missing values

Sample

First rows

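(index) | Client_Income | Car_Owned | Bike_Owned | Active_Loan | House_Own | Child_Count | Credit_Amount | Loan_Annuity | Accompany_Client | Client_Income_Type | Client_Education | Client_Marital_Status | Client_Gender | Loan_Contract_Type | Client_Housing_Type | Population_Region_Relative | Age_Days | Employed_Days | Registration_Days | ID_Days | Own_House_Age | Mobile_Tag | Homephone_Tag | Workphone_Working | Client_Occupation | Client_Family_Members | Cleint_City_Rating | Application_Process_Day | Application_Process_Hour | Client_Permanent_Match_Tag | Client_Contact_Work_Tag | Type_Organization | Phone_Change | Credit_Bureau | Default
0 | 6750.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 61190.55 | 3416.85 | Alone | Commercial | Secondary | M | Male | CL | Home | 0.028663 | 13957.0 | 1062.0 | 6123.0 | 383.0 | 9.0 | 1.0 | 1.0 | 0.0 | Sales | 2.0 | 2.0 | 6.0 | 17.0 | Yes | Yes | Self-employed | 63.0 | 1.0 | 0
1 | 20250.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 15282.00 | 1826.55 | Alone | Service | Graduation | M | Male | CL | Home | 0.008575 | 14162.0 | 4129.0 | 7833.0 | 21.0 | 0.0 | 1.0 | 0.0 | 1.0 | Other | 2.0 | 2.0 | 3.0 | 10.0 | Yes | Yes | Government | 755.0 | 1.0 | 0
2 | 18000.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 59527.35 | 2788.20 | Alone | Service | Graduation dropout | W | Male | CL | Family | 0.022800 | 16790.0 | 5102.0 | 4493.0 | 331.0 | 9.0 | 1.0 | 0.0 | 0.0 | Realty agents | 2.0 | 2.0 | 4.0 | 12.0 | Yes | Yes | Self-employed | 277.0 | 0.0 | 0
3 | 15750.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 53870.40 | 2295.45 | Alone | Retired | Secondary | M | Male | CL | Home | 0.010556 | 23195.0 | 365243.0 | 4493.0 | 775.0 | 9.0 | 1.0 | 0.0 | 0.0 | Other | 2.0 | 3.0 | 2.0 | 15.0 | Yes | Yes | XNA | 1700.0 | 3.0 | 0
4 | 33750.0 | 1.0 | 0.0 | 1.0 | 0.0 | 2.0 | 133988.40 | 3547.35 | Alone | Commercial | Secondary | M | Female | CL | Home | 0.020713 | 11366.0 | 2977.0 | 5516.0 | 4043.0 | 6.0 | 1.0 | 0.0 | 0.0 | Laborers | 4.0 | 1.0 | 3.0 | 12.0 | Yes | Yes | Business Entity Type 3 | 674.0 | 1.0 | 0
5 | 11250.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 13752.00 | 653.85 | Alone | Service | Secondary | W | Female | CL | Home | 0.019101 | 13881.0 | 1184.0 | 3910.0 | 3910.0 | 9.0 | 1.0 | 0.0 | 0.0 | Laborers | 2.0 | 2.0 | 2.0 | 10.0 | Yes | Yes | Other | 739.0 | 0.0 | 0
6 | 15750.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 128835.00 | 3779.55 | Alone | Retired | Secondary | S | Male | CL | Home | 0.016612 | 21323.0 | 365243.0 | 113.0 | 4855.0 | 10.0 | 1.0 | 0.0 | 0.0 | Other | 1.0 | 2.0 | 3.0 | 14.0 | Yes | Yes | XNA | 0.0 | 3.0 | 0
7 | 13500.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 60415.20 | 3097.80 | Alone | Retired | Secondary | M | Male | CL | Home | 0.009175 | 22493.0 | 365243.0 | 12617.0 | 5280.0 | 9.0 | 1.0 | 0.0 | 1.0 | Other | 2.0 | 2.0 | 4.0 | 15.0 | Yes | Yes | XNA | 1687.0 | 4.0 | 0
8 | 13500.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 45000.00 | 1200.15 | Relative | Commercial | Graduation | M | Female | CL | Home | 0.006008 | 15734.0 | 7889.0 | 5455.0 | 2665.0 | 14.0 | 1.0 | 0.0 | 1.0 | Sales | 3.0 | 2.0 | 4.0 | 13.0 | Yes | Yes | Self-employed | 1611.0 | 0.0 | 0
9 | 12150.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 16320.15 | 1294.65 | Alone | Retired | Secondary | W | Male | CL | Home | 0.016612 | 20507.0 | 365243.0 | 2834.0 | 4053.0 | 9.0 | 1.0 | 0.0 | 0.0 | Other | 1.0 | 2.0 | 3.0 | 9.0 | Yes | Yes | XNA | 533.0 | 5.0 | 0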

Last rows

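(index) | Client_Income | Car_Owned | Bike_Owned | Active_Loan | House_Own | Child_Count | Credit_Amount | Loan_Annuity | Accompany_Client | Client_Income_Type | Client_Education | Client_Marital_Status | Client_Gender | Loan_Contract_Type | Client_Housing_Type | Population_Region_Relative | Age_Days | Employed_Days | Registration_Days | ID_Days | Own_House_Age | Mobile_Tag | Homephone_Tag | Workphone_Working | Client_Occupation | Client_Family_Members | Cleint_City_Rating | Application_Process_Day | Application_Process_Hour | Client_Permanent_Match_Tag | Client_Contact_Work_Tag | Type_Organization | Phone_Change | Credit_Bureau | Default
121846 | 12150.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 25470.00 | 1462.05 | Alone | Retired | Graduation dropout | S | Male | CL | Home | 0.025164 | 24123.0 | 365243.0 | 9523.0 | 795.0 | 9.0 | 1.0 | 0.0 | 0.0 | Other | 1.0 | 2.0 | 3.0 | 9.0 | Yes | Yes | XNA | 0.0 | 0.0 | 0
121847 | 15750.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 26128.80 | 1283.85 | Alone | Commercial | Other | M | Male | CL | Home | 0.018850 | 14025.0 | 1107.0 | 507.0 | 4514.0 | 24.0 | 1.0 | 0.0 | 0.0 | Managers | 2.0 | 1.0 | 5.0 | 9.0 | Yes | Yes | Business Entity Type 3 | 1175.0 | 1.0 | 0
121848 | 18000.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | 27302.40 | 2169.90 | Alone | Service | Secondary | M | Female | CL | Home | 0.035792 | 11073.0 | 1521.0 | 4883.0 | 3602.0 | 23.0 | 1.0 | 0.0 | 0.0 | Sales | 3.0 | 2.0 | 2.0 | 14.0 | Yes | Yes | Housing | 1718.0 | 2.0 | 0
121849 | 10350.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 18792.90 | 1736.55 | Alone | Service | Graduation dropout | S | Male | CL | Municipal | 0.010032 | 9204.0 | 763.0 | 3773.0 | 1874.0 | 9.0 | 1.0 | 0.0 | 1.0 | Sales | 1.0 | 2.0 | 3.0 | 11.0 | Yes | Yes | Self-employed | 774.0 | 1.0 | 0
121850 | 12150.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 78192.00 | 2383.65 | Alone | Retired | Secondary | S | Male | CL | Home | 0.018850 | 23943.0 | 365243.0 | 1213.0 | 4011.0 | 9.0 | 1.0 | 0.0 | 0.0 | Other | 1.0 | 2.0 | 2.0 | 11.0 | Yes | Yes | XNA | 1581.0 | 2.0 | 0
121851 | 29250.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 107820.00 | 3165.30 | Relative | Service | Secondary | M | Female | CL | Home | 0.031329 | 12889.0 | 2863.0 | 2661.0 | 2943.0 | 9.0 | 1.0 | 0.0 | 0.0 | Laborers | 2.0 | 2.0 | 4.0 | 16.0 | Yes | No | Business Entity Type 2 | 0.0 | 1.0 | 1
121852 | 15750.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 104256.00 | 3388.05 | Alone | Commercial | Graduation | M | Female | CL | Home | 0.018209 | 8648.0 | 636.0 | 902.0 | 1209.0 | 9.0 | 1.0 | 1.0 | 0.0 | Sales | 2.0 | 3.0 | 4.0 | 12.0 | Yes | Yes | Self-employed | 4.0 | 0.0 | 0
121853 | 8100.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 55107.90 | 2989.35 | Alone | Govt Job | Secondary | M | Male | CL | Home | 0.008068 | 9152.0 | 1623.0 | 3980.0 | 353.0 | 9.0 | 1.0 | 0.0 | 0.0 | High skill tech | 3.0 | 3.0 | 5.0 | 11.0 | No | No | Trade: type 6 | 0.0 | 1.0 | 0
121854 | 38250.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 45000.00 | 2719.35 | Alone | Service | Graduation | M | Female | CL | Home | 0.028663 | 10290.0 | 847.0 | 895.0 | 2902.0 | 4.0 | 1.0 | 0.0 | 0.0 | Sales | 2.0 | 2.0 | 1.0 | 12.0 | Yes | Yes | Business Entity Type 3 | 0.0 | 2.0 | 0
121855 | 9000.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 62428.95 | 4201.65 | Alone | Commercial | Secondary | S | Male | CL | Home | 0.018029 | 14772.0 | 498.0 | 8679.0 | 5025.0 | 6.0 | 1.0 | 0.0 | 0.0 | Managers | 2.0 | 3.0 | 4.0 | 6.0 | Yes | Yes | Business Entity Type 3 | 805.0 | 0.0 | 0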

Duplicate rows

Most frequent

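(index) | Client_Income | Car_Owned | Bike_Owned | Active_Loan | House_Own | Child_Count | Credit_Amount | Loan_Annuity | Accompany_Client | Client_Income_Type | Client_Education | Client_Marital_Status | Client_Gender | Loan_Contract_Type | Client_Housing_Type | Population_Region_Relative | Age_Days | Employed_Days | Registration_Days | ID_Days | Own_House_Age | Mobile_Tag | Homephone_Tag | Workphone_Working | Client_Occupation | Client_Family_Members | Cleint_City_Rating | Application_Process_Day | Application_Process_Hour | Client_Permanent_Match_Tag | Client_Contact_Work_Tag | Type_Organization | Phone_Change | Credit_Bureau | Default | count
0 | 2700.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 9594.00 | 1020.15 | Alone | Service | Secondary | M | Male | CL | Home | 0.028663 | 19128.0 | 1323.0 | 4646.0 | 2670.0 | 9.0 | 1.0 | 1.0 | 0.0 | Security | 2.0 | 2.0 | 5.0 | 7.0 | Yes | No | Agriculture | 3.0 | 1.0 | 0 | 2
1 | 2700.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 76022.55 | 3233.70 | Alone | Retired | Secondary | M | Male | CL | Home | 0.019101 | 20923.0 | 365243.0 | 2281.0 | 4289.0 | 7.0 | 1.0 | 0.0 | 0.0 | Other | 2.0 | 2.0 | 1.0 | 11.0 | Yes | Yes | XNA | 621.0 | 2.0 | 0 | 2
2 | 3015.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 28856.25 | 1669.50 | Alone | Retired | Junior secondary | M | Female | CL | Home | 0.031329 | 23770.0 | 365243.0 | 1053.0 | 4945.0 | 9.0 | 1.0 | 0.0 | 0.0 | Other | 2.0 | 2.0 | 4.0 | 7.0 | Yes | Yes | XNA | 210.0 | 2.0 | 0 | 2
3 | 3150.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 45000.00 | 3044.25 | Alone | Govt Job | Secondary | M | Male | CL | Home | 0.018029 | 10317.0 | 1112.0 | 2683.0 | 2681.0 | 9.0 | 1.0 | 1.0 | 1.0 | Other | 2.0 | 3.0 | 2.0 | 15.0 | Yes | Yes | School | 1724.0 | 0.0 | 0 | 2
4 | 3150.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 10188.00 | 676.35 | Alone | Commercial | Secondary | S | Male | CL | Municipal | 0.018634 | 11240.0 | 776.0 | 10403.0 | 3470.0 | 9.0 | 1.0 | 1.0 | 0.0 | Low-skill Laborers | 3.0 | 2.0 | 2.0 | 9.0 | Yes | No | Business Entity Type 2 | 448.0 | 1.0 | 0 | 2
5 | 3150.0 | 0.0 | 0.0 | 1.0 | 0.0 | 2.0 | 33750.00 | 2666.25 | Alone | Commercial | Secondary | M | Male | CL | Home | 0.028663 | 14280.0 | 179.0 | 7385.0 | 3102.0 | 9.0 | 1.0 | 0.0 | 0.0 | Cleaning | 4.0 | 2.0 | 4.0 | 12.0 | Yes | Yes | School | 1810.0 | 1.0 | 0 | 2
6 | 3150.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 10188.00 | 581.85 | Alone | Retired | Secondary | S | Male | CL | Home | 0.019689 | 24323.0 | 365243.0 | 180.0 | 4037.0 | 9.0 | 1.0 | 0.0 | 1.0 | Other | 1.0 | 2.0 | 3.0 | 9.0 | Yes | Yes | XNA | 3254.0 | 1.0 | 0 | 2
7 | 3240.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 11376.00 | 648.00 | Kids | Service | Secondary | M | Male | CL | Home | 0.018850 | 18392.0 | 1581.0 | 6643.0 | 1945.0 | 9.0 | 1.0 | 1.0 | 0.0 | Other | 2.0 | 2.0 | 3.0 | 12.0 | Yes | Yes | Business Entity Type 3 | 1047.0 | 1.0 | 0 | 2
8 | 3375.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 107820.00 | 3165.30 | Alone | Retired | Secondary | M | Male | CL | Home | 0.046220 | 21157.0 | 365243.0 | 6859.0 | 4247.0 | 9.0 | 1.0 | 0.0 | 0.0 | Other | 2.0 | 1.0 | 2.0 | 17.0 | Yes | Yes | XNA | 785.0 | 2.0 | 0 | 2
9 | 3600.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 16789.50 | 1635.30 | Alone | Retired | Secondary | M | Male | CL | Home | 0.031329 | 21705.0 | 365243.0 | 10011.0 | 3925.0 | 9.0 | 1.0 | 0.0 | 0.0 | Other | 2.0 | 2.0 | 6.0 | 11.0 | Yes | Yes | XNA | 1749.0 | 1.0 | 0 | 2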